Abstract: Optical Character Recognition (OCR) is a computer system designed to transform images of typewritten text (typically captured by a scanner) into machine-processable text. Arabic OCR has been developed and enhanced over decades, producing an enormous number of approaches with robust results that, in some cases, approach an accuracy of approximately 99%. However, existing OCR systems, or at least subsets of them, exhibit shortcomings when applied to new applications, such as low-resolution inputs and video-based inputs. Accordingly, there is a need to review the existing approaches that have shown robust results, analyse their mechanisms, and list their advantages and disadvantages in order to ease the adaptation and extension of these systems to the new applications in this field. This paper presents a literature review of existing Arabic OCR systems, draws a common mechanism out of them, and lists their differences, advantages, and disadvantages to help in adapting or extending these systems to fit recent demands.

Keywords: Optical Character Recognition, OCR, Pre-processing Operations, Segmentation, Optical Font Recognition.